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ABSTRACT 



This paper uses diffeomorphometry methods to quantify the order in which statistically significant morphometric 
change occurs in three medial temporal lobe regions, the amygdala, entorhinal cortex (ERC), and hippocampus 
among subjects with symptomatic and preclinical Alzheimer's disease (AD). Magnetic resonance imaging scans 
were examined in subjects who were cognitively normal at baseline, some of whom subsequently developed clin- 
ical symptoms of AD. The images were mapped to a common template, using shape-based diffeomorphometry. The 
multidimensional shape markers indexed through the temporal lobe structures were modeled using a changepoint 
model with explicit parameters, specifying the number of years preceding clinical symptom onset. Our model as- 
sumes that the atrophy rate of a considered brain structure increases years before detectable symptoms. 
The results demonstrate that the atrophy changepoint in the ERC occurs first, indicating significant change 
8-10 years prior to onset, followed by the hippocampus, 2-4 years prior to onset, followed by the amygdala, 
3 years prior to onset. The ERC is significant bilaterally, in both our local and global measures, with estimates of 
ERC surface area loss of 2.4% (left side) and 1.6% (right side) annually. The same changepoint model for ERC volume 
gives 3.0% and 2.7% on the left and right sides, respectively. Understanding the order in which changes in the brain 
occur during preclinical AD may assist in the design of intervention trials aimed at slowing the evolution of the 
disease. 

© 2014 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license 
(http://creativecommons.Org/licenses/by-nc-nd/3.0/). 



1. Introduction 

Brain imaging and MRI studies have substantially advanced our 
knowledge of regional brain atrophy in Alzheimer's disease (AD). Mag- 
netic resonance imaging (MRI) measures are an indirect reflection of 
the neuronal injury that occurs in the brain as the AD pathophysiological 
process evolves. In the initial stages of AD, atrophy on MRI appears to 
have a predilection for the brain regions with heavy deposits of neurofi- 
brillary tangles (Braak and Braak, 1991; Arnold et al, 1991; Price and 
Morris, 1999). Consistent with this pattern, the volume of the entorhinal 
cortex, the hippocampus and other medial temporal lobe structures has 



Abbreviations: AD, Alzheimer's disease; MCI, mild cognitive impairment; ERC, entorhi- 
nal cortex; NIH, Clinical Center of the National Institutes of Health; NIA, National Institute 
on Aging; NIMH, National Institute for Mental Health; GPB, Geriatric Psychiatry Branch; 
SPGR, spoiled gradient echo; CDR, clinical dementia rating; FWER, family-wise error 
rate; ROI-LDDMM, region-of-interest large deformation diffeomorphic metric mapping; 
RSS, residual sum of squares; MMSE, mini-mental state exam; diffeomorphometry, 
study of shape using a metric on the diffeomorphic connections between structures. 
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been shown to discriminate between patients with AD dementia and con- 
trols, and between subjects with mild cognitive impairment (MCI) and 
controls, and to be associated with time to progress from MCI to AD de- 
mentia (Kantarci and Jack, 2003; Atiya et al, 2003). Longitudinal MRI 
data in cognitively normal individuals who have progressed to mild im- 
pairment (i.e., preclinical AD) is limited but suggests that volumetric mea- 
sures of medial temporal lobe regions may predict progression from 
normal cognition to mild impairment. Differences in atrophy rate of the 
entorhinal cortex (Jack et al, 2004; Miller et al, 2013a), the hippocampus 
or sub volumes of the hippocampus (Jack et al., 2004; Apostolova et al, 
2010) and ventricular volume (Carlson et al, 2008) have been demon- 
strated during preclinical AD. It has also been demonstrated that baseline 
measures of the hippocampus and amygdala in controls predict subse- 
quent development of MCI (den Heijer et al, 2006), with hippocampus 
shape differences being reported among controls who subsequently 
developed cognitive impairment (Rusinek et al, 2003; Chiang et al., 
2009; Csernansky et al., 2005; den Heijer et al, 2006; Thambisetty et al., 
2010). 

Methods of statistical analysis based on diffeomorphometry for 
studying normal age-related changes in subcortical nuclei and in a 
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number of other diseases have already been enlightening (Qiu et al, 
2010; Qiu et al, 2009a; Csernansky et al, 1998; Csernansky et al, 
2000; Wang et al, 2007; Ashburner et al, 2003; Thompson et al, 
2004; Younes et al, 2012; Tang et al, 2013). This study follows our pre- 
vious study in the same subject population (Miller et al, 2013a) in 
which we used diffeomorphometry to measure subregional atrophy in 
three structures of the temporal lobe, entorhinal cortex (ERC), hippo- 
campus and amygdala and demonstrated statistically significant chang- 
es in brain structures during preclinical AD. These prior results are 
consistent with histopathological findings that suggest that these re- 
gions are affected during the earliest phase of AD (Arriagada et al., 
1992; Herzog and Kemper, 1980; Scott et al, 1991; Scott et al, 1992; 
Tsuchiya and Kosaka, 1990). This approach allows for a fine-scale, 
high-dimensional, analysis of non-uniform change patterns in the struc- 
tures, and complements coarser low-dimensional measures, like struc- 
ture volume. 

The study described here focuses on the temporal order of atrophy of 
the same three structures. The diffeomorphometry pipeline follows our 
general pattern (Younes et al., 2012; Tang et al., 2013; Miller et al., 
2013a), first involving an initial coarse rigid alignment phase followed 
by a high-dimensional template-matching phase. This registers all 
shape morphometry to a single template coordinate system, which is 
centered to the population, producing a high-dimensional representa- 
tion of the data in a coordinate system in which each coordinate is di- 
rectly comparable across the population. The statistical analysis uses 
multivariate models including a nonlinear component defining a 
changepoint in atrophy over time, with significance assessed while tak- 
ing multiple comparisons into account. The introduction of the 
changepoint model offers the opportunity to quantify the temporal or- 
dering of morphometric changes among these temporal lobe structures 
in preclinical AD (i.e., the ERC, hippocampus and amygdala), which we 
have already found to be discriminating in these groups of temporal 
lobe structures in preclinical AD (Miller et al, 2013a). No modeling of 
the order in years preceding clinical onset has yet been explicitly 
modeled or demonstrated to our knowledge. 



2. Subjects and data acquisition 

2.1. Study design 

The overall study (known as the BIOCARD study), is a longitudinal 
characterization of individuals funded jointly by the National Institutes 
on Aging (NIA) and Mental Health (NIMH). All BIOCARD subjects were 
cognitively normal when recruited with mean age at baseline of 
57.1 years. Scans were acquired during the period 1995-2005. A total 
of 805 scans have been collected during the 10-year period. The partic- 
ipants have now been followed for up to 18 years. 

A total of 354 individuals were initially enrolled in the study. Re- 
cruitment was conducted by the staff of the Geriatric Psychiatry branch 
of the Intramural Program of the NIMH, beginning in 1 995 and ending in 
2005. Subjects were recruited via printed advertisements, articles in 
local or national media, informational lectures, or word-of-mouth. The 
study was designed to recruit and follow a cohort of cognitively normal 
individuals who were primarily in middle age. By design, approximately 
three-quarters of the participants had a first-degree relative with de- 
mentia of the Alzheimer type. The overarching goal was to identify var- 
iables among cognitively normal individuals that could predict the 
subsequent development of mild to moderate symptoms of AD. Subjects 
were administered a comprehensive neuropsychological battery annu- 
ally. MRI scans, cerebrospinal fluid (CSF), and blood specimens were ob- 
tained every 2 years. The study was initiated at the Clinical Center of the 
National Institutes of Health (NIH) in 1995, and was stopped in 2005. In 
2009, our research team was funded to re-establish the cohort, continue 
the annual clinical and cognitive assessments, collect blood, and evalu- 
ate the previously acquired MRI scans, CSF and blood specimens. 



At baseline, all participants completed a comprehensive evaluation 
at the NIH. This evaluation consisted of a physical and neurological ex- 
amination, an electrocardiogram, standard laboratory studies (e.g., 
complete blood count, vitamin B12, thyroid function), and neuropsy- 
chological testing. Individuals were excluded from participation if they 
were cognitively impaired, as determined by cognitive testing, or had 
significant medical problems such as severe cerebrovascular disease, 
epilepsy or alcohol or drug abuse. Five subjects did not meet the entry 
criteria and were excluded at baseline, leaving a total of 349 participants 
followed over time. 

2.2. MRI assessments 

MRI scans were obtained on 335 participants at baseline. An addi- 
tional 470 scans were obtained in subsequent years for a total of 805 
scans. The mean interval between scan acquisitions on follow-up was 
2.02 years. The MRI scans acquired at the NIH were obtained using a 
standard multi-modal protocol using GE 1.5 T scanner. The scanning 
protocol included localizer scans, axial FSE sequence (TR = 4250 ms, 
TE = 108 ms, FOV = 512 x 512, thickness/gap = 5.0/0.0 mm, flip 
angle = 90°, 28 slices), axial FLAIR sequence (TR = 9002 ms, TE = 
157.5 ms, FOV = 256 x 256, thickness/gap = 5.0/0.0 mm, flip 
angle = 90°, 28 slices), coronal SPGR (spoiled gradient echo) sequence 
(TR = 24 ms, TE = 2 ms, FOV = 256 x 256, thickness/gap = 2.0/ 
0.0 mm, flip angle = 20°, 124 slices), sagittal SPGR sequence (TR = 
24 ms, TE = 3 ms, FOV = 256 x 256, thickness/gap 1.5/0.0 mm, flip 
angle = 45°, 124 slices). 

2.3. Clinical and cognitive assessments 

The clinical and cognitive assessments of the participants have 
been described elsewhere (Soldan et al, 2013). The cognitive assess- 
ment consisted of a neuropsychological battery covering all major 
cognitive domains (i.e., memory, executive function, language, 
spatial ability, attention and processing speed). A clinical assessment 
was also conducted annually. Since the study has been conducted 
by the current research team, this has included the following: a 
physical and neurological examination, record of medication use, be- 
havioral and mood assessments (Cummings et al., 1994; Yesavage 
et al, 1982), family history of dementia, history of symptom onset, 
and a clinical dementia rating (CDR), based on a semi-structured in- 
terview (Hughes et al, 1982; Morris, 1993). The clinical assessments 
given at the NIH covered similar domains. The diagnostic procedures 
in this study are comparable to those used in the Alzheimer's disease 
research centers program, funded by the National Institute on Aging. 
This involves a two-step process by which a decision is first made 
about whether the subject is normal, mildly impaired or demented 
(based on the clinical history, medical, neurologic and psychiatric 
evaluations and the cognitive testing), and then (if the subject 
is judged not to be normal) the likely cause(s) of the cognitive im- 
pairment is determined. This same diagnostic process was applied 
retrospectively to participants who had become cognitively im- 
paired while the study was being conducted at the NIH, but who 
(by the time the study had been re-established) were either 
moderate-to-severely impaired or were no longer living. It should 
be noted that the estimated age-of-onset of clinical symptoms, 
which is the primary outcome in these analyses, was established on 
the basis of clinical information elicited during the CDR interview 
by the clinician who evaluated the subject (or on the basis of clinical 
notes in the record), and re-confirmed during the consensus 
conference. 

2.4. MRI scans available for analysis 

Some subjects were removed from the analysis for uncertain diag- 
nostic (impairment not categorized as MCI) and several scans had to 
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be removed because they contained significant artifacts that prevented 
structure segmentation. The dataset used in this study includes MRI 
scans from 296 subjects, of which 230 individuals remained cognitively 
normal and 66 developed cognitive impairment and were diagnosed 
with MCI (of these, 17 subsequently progressed to AD dementia). The 
average number of scans per subject is 2.2 among the 230 controls 
and 2.5 among the 66 who progressed to MCI or AD dementia, for a 
total number of 661 scans overall used in this study. Table 1 summarizes 
demographic data for the subjects that were the focus of our analyses. 

It should be noted that our dataset differs from the one used in Miller 
et al. (2013a), which only used controls and preclinical subjects, in that 
all of the MRI scans available for the subjects described above were used 
in these analyses, since the goal was to estimate two slopes, one prior to 
the changepoint and one after the changepoint. Controls and subjects 
with preclinical AD (i.e., those who were cognitively normal at baseline 
but subsequently progressed to MCI or AD dementia) contribute to the 
estimation of the slope before the changepoint. Post-changepoint MRI 
scans contribute to the estimation of the slope after the changepoint, 
(which our data show is steeper than the pre-changepoint slope). For 
the post-changepoint slope to be reliably estimated, we used all of the 
post-changepoint MRI scans available, including those for subjects 
whose estimated age of onset was at or before 'baseline'. In the rest of 
this paper, we refer to the group of all patients who exhibit signs of cog- 
nitive impairment as the MCI/AD, or "cognitively impaired" group, 
whether this diagnosis was made during the first NIH study, or later 
as part of the extended study. 

3. Methods 

3 A. Surface based diffeomorphometry 

The approach used for the computation of shape coordinates via 
diffeomorphometry has been described in Miller et al. (2013a) and is re- 
peated here based on the application of geodesic positioning of morpho- 
metric markers to template coordinates (Miller et al, 2013b). The ROI- 
based diffeomorphometry (Qiu et al, 2010; Qiu et al, 2009a; 
Csernansky et al, 1998; Csernansky et al, 2000; Wang et al, 2007; 
Ashburner et al, 2003; Thompson et al., 2004; Younes et al, 2014; 
Miller et al., 2013a) for the entorhinal cortex, amygdala and hippocam- 
pus has three steps: (i) segmentation of the target structures, (ii) gener- 
ation of a single template coordinate system from the population of 
baseline scans, and (iii) mapping of the template onto each of the target 
segmented structures. The first step is the segmentation of the struc- 
tures based on the ROI large deformation diffeomorphic metric map- 
ping (ROI-LDDMM) procedure described previously (Csernansky et al., 
1998; Munn et al, 2007; Miller et al., 2013a), in which subcubes are an- 
alyzed for each structure with local landmarks used for initial coarse 
alignment before the high-dimensional LDDMM positioning is applied. 
This step was performed without information about the diagnostic sta- 
tus of the subjects. 

To generate shape biomarkers indexed to a common template coor- 
dinate system, we follow a previously published method (Younes et al., 
2014; Miller et al, 2013a) in which surfaces are rigidly aligned (rotation, 
translation), with right sub volumes flipped before alignment to ensure 
all structures could be compared. From rigidly aligned volumes, a tem- 
plate shape was generated from the entire population. For this, the ob- 
served surfaces are modeled as random deformations of the unknown 
to be estimated template (Ma et al., 2010), with the template coordinate 
system centered to the population generated by performing surface 



registration iteratively. The resulting templates for the amygdala, ento- 
rhinal cortex and hippocampus become the coordinate systems that are 
referenced for our p-values and our changepoint estimation. 

The high-dimensional shape descriptors indexed to template coordi- 
nates are generated by computing a diffeomorphic correspondence be- 
tween the template and each surface using LDDMM surface registration 
(Vaillant and Glaunes, 2005; Vaillant et al., 2007). The algorithm com- 
putes a smooth, invertible mapping <p of the triangulated surface tem- 
plate Stemp onto the target surfaces S ta rget- The mapping minimizes a 
fidelity criterion measuring the distance between the mapped template 
<P- Stemp an d tne target, defined as a norm between surfaces, and is pe- 
nalized by a geodesic transformation energy enforcing smoothness. 

Our shape marker measures the extent to which area is locally in- 
creased or reduced when mapping each subject surface to the template. 
This measure is directly computable from the registration of triangulat- 
ed surfaces as the ratio of the total area of the deformed triangles con- 
taining a given vertex to their original total area on the template, 
which we express in logarithmic scale. This results, for each subject, in 
one marker per vertex in the template surface. For computational effi- 
ciency, we sub-discretize this measure by averaging it over small seg- 
ments computed on the surface template. These segments are 
obtained by spectral clustering (Reuter, 2010) of the surface, a method 
which only relies on the surface geometry. This is achieved by comput- 
ing the first k eigenvectors of the Laplace-Beltrami operator 
associated with the surface, where k is the intended number of seg- 
ments, associating with each vertex a /(-dimensional vector formed 
with the values of the eigenvectors evaluated at this point. These 
vectors are then used in a standard /c-means algorithm to provide the 
k desired segments. The segments that were used in our analyses are 
provided in Fig. 1. Their number is adjusted so that they cover an area 
of 50 mm 2 on average, yielding 1 5 segments on the amygdala, 29 seg- 
ments on the hippocampus and 10 segments on the ERC Complemen- 
tary to the shape markers, our single-dimensional volumetric measure 
is the logarithm of the total volume of each structure. 

32. Statistical analysis via the morphometric changepoint model 

We analyze the shape markers with a model describing a change in a 
linear atrophy rate happening some number, A, of years before the esti- 
mated age of clinical onset, forming a morphometry changepoint before 
clinical symptom onset time. We assume linear models for the absolute 
volume and atrophy as a function of age, with a change of slope at 
changepoint. In mathematical terms, if a is the age and t is the time of 
clinical onset, the model takes the form a— ► a+ j6a before morphometry 
changepoint, a < t — A, and becomes a-> a+ j6a+ jS'(a— t— A) subse- 
quent to the changepoint. For the controls, the onset time of clinical 
symptom is (if it ever occurs) beyond the end of the study and the asso- 
ciated model retains the same linear rate over the considered time peri- 
od. This model, which is depicted in Fig. 2, can be related to the 
sigmoidal model introduced in Jack et al. (2013), in which biomarkers 
smoothly transition from a low-abnormality plateau to a high- 
abnormality one in an S-shaped curve. Under Jack's model, almost no 
patient from our dataset would have reached the high-abnormality pla- 
teau at scan times, and we therefore expect to observe them while they 
are either in the low-abnormality plateau, or in the transition phase, 
which justifies using a model with two regimes only. Note that our 
model does not include a second changepoint at clinical onset time. 
Since there is no reason for structure atrophy to switch regime at the 
very time the disease becomes detectable by clinical tests, we do not 



Table 1 

Participant characteristics at baseline and follow-up stratified by outcome status. 



Variable Control group (N = 230) MCI or AD before 2005 (N = 15) MCI or AD after 2005 (N = 51 ) Combined MCI/AD (N = 66) 

Age at entry, mean number of years (SD) 55.3 (9.8) 66(7.5) 62.4(11.4) 63.3 (10.8) 

Gender, females (%) 61% 53% 49% 50% 
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Fig. 1. Parcellation of the three structures using spectral clustering. The average segment 
size is 50 mm 2 . Amygdala: 15 segments, hippocampus: 29 segments, ERC: 10 segments. 

expect any direct causal effect of clinical onset on the morphometry. Of 
course, the two-piece model we consider allows for both structural 
changepoint and clinical onset time to coincide (i.e., A = 0). 

The analysis includes gender and log-intracranial volumes as covar- 
iates, and uses a random effect model to account for inter-subject versus 
intra-subject shape variation (see details in Appendix A). We test for the 
null hypothesis of constant-rate atrophy in the whole cohort versus the 
alternate changepoint model and compute statistics at each subregion 
of the segmented template surface, as delineated in Fig. 1, returning p- 
values corrected for multiple comparisons using permutation testing 
(Nichols and Hayasaka, 2003). Permutation testing applied to multiple 
hypotheses associated with subregions of the structures provides a 
unique p-value for rejecting the compound null that all individual 
nulls are true, i.e., that the onset model applies to none of the regions 
in the segmented structure. Assuming that the compound null is 
rejected, the method also provides an estimate of the set of significant 
regions, which is conservative in the sense that the probability of mak- 
ing a single false detection is small (5% in our results). This is usually re- 
ferred to as controlling the family-wise error rate (FWER) at the 5% 
level. In addition to the permutation testing, we used a resampling 
method (bootstrap) to assess the accuracy of the estimated onset 
time, allowing us to provide an indication of the standard deviation of 
our estimated changepoint. More details on our implementation of per- 
mutation testing and bootstrap validation are provided in Appendix A. 




Fig. 3. Log-likelihood as a function of changepoint for the amygdala (top), the hippocam- 
pus (middle) and the entorhinal cortex (bottom), with left structure on the leftcolumn 
and right on the right one. The left hippocampus is not significant and is only provided 
here for completeness. Likelihoods are averaged over all segments constituting the sur- 
face. The red curve is the likelihood obtained on the original sample, andthe blue one is av- 
eraged over bootstrap resampling. Both curves are offset so that their minimum value is 
zero, to simplify visualization. 



4. Results 

The curves in Fig. 3 show the log-likelihood calculation for the seg- 
ment morphometry markers as a function of the parameter A (the plot- 
ted log-likelihoods are obtained by maximizing over every model 
parameter except A). The changepoint estimator is the maximizer of 
this curve. The top rows in Fig. 3 show the left and right sides of the 
amygdala, the middle row shows those of the hippocampus, and the 
bottom row shows those of the ERC. The curves for the amygdala and 
the hippocampus are highest at around 2 years before clinical onset, 
while the likelihood for the ERC is largest at around 10 years before 
onset. The bootstrap-estimated standard deviations of these sample es- 
timates lie between 1 and 3 years as reported in Table 2. 



Min atrophy 



Max atrophy 




Fig. 2. Schematic illustration of the changepoint model. The atrophy regime changes at 
changepoint, several years before clinical onset. 



Table 2 also provides the p-values associated with the changepoint 
model for the ERC, amygdala, and hippocampus. The segment- 
averaged measures in the right hemisphere show significant differences 
between the groups as follows: the ERC is significant (p= 0.004) with a 
changepoint at around 8.5 years ( ±2.8). The hippocampus is marginally 
significantly different (p= 0.024), with a changepoint estimated at 3.6 
(±1.4) years prior to symptom onset, and the amygdala is strongly sig- 
nificant (p= 0.0015), with a changepoint at 2.7 (±1.9) years before 
onset. For the left hemisphere, the significant differences are as follows: 
(1) the entorhinal cortex demonstrates a significant difference 
(p< 0.0001 ) 8.6 (±1.5) years prior to symptom onset, and (2) the amyg- 
dala shows significant differences (p= 0.006) 2.5 (±1.4) years prior to 
symptom onset. The left hippocampus shows no significance. The vol- 
ume measures are significant for the amygdala (left: p= 0.003, right: 
p= 0.006), significant on the left side (p= 0.017) for the hippocampus, 
and not significant on the right (p= 0.27). Volume is significant for the 
ERC (left: p< 0.0001, right: p= 0.0007). 

Figs. 4-6 provide predicted atrophy curves with changepoint esti- 
mated from our model for all diseased subjects. The morphometric 
measure is the amount of surface area variation relative to the tem- 
plates, and is expressed in percentages, with 100% meaning no differ- 
ence from the template. To simplify visualization, all curves are 
plotted using the same covariate values (gender and intracranial vol- 
ume) for all subjects, taken equal to their population averages. Since 
there is one such curve per segment, they are also averaged over all seg- 
ments in each structure. The evolution starts with a nearly horizontal 
line, which coincides with the one associated with controls, then chang- 
es to a steeper slope at changepoint. The smoothness of the transition is 
due to averaging, since changepoints are not identical for each segment. 
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Table 2 

Differences in estimated onset of morphometric change in relationship to symptom onset for the amygdala, entorhinal cortex and hippocampus. The p-value, corrected for multiple com- 
parisons, is obtained for each structure via permutation tests, as described in the Methods section (the lowest detectable p-value is 10~ 4 ). The reported onset time for each structure is 
estimated via bootstrap (the standard deviation between parentheses measuring its accuracy). 



Side 




Amygdala volume 


Amygdala segment 


Hippocampus volume 


Hippocampus segment 


ERC volume 


ERC segment 


Left 


p-Value 


0.003 


0.006 


0.017 


0.13 


0.0001 


0.0001 




Avg. A (std.) 


2.8 (2.3) 


2.5 (1.4) 


3.1 (1.9) 


2.6(1.0) 


7.5 (2.5) 


8.6(1.5) 


Right 


p-Value 


0.006 


0.0015 


0.27 


0.024 


0.0007 


0.004 




Avg. A (std.) 


2.5 (3.3) 


2.7 (1.9) 


4.0 (3.8) 


3.6(1.4) 


9.9 (3.0) 


8.5 (2.8) 



The middle row in Figs. 4-6 plots these curves against subject age, while 
the bottom row provides the same curves shifted by clinical onset age, 
which enables comparison (the slopes are the same in both cases, 
even though the unshifted curves appear steeper; this is due to the 
rescaling of the horizontal axis for shifted curves). These plots also visu- 
alize the actual morphometric measure (still averaged over segments) 
for all cognitively impaired cases, represented as dots colored according 
to whether the scan is taken before changepoint, between changepoint 
and clinical onset, or after clinical onset. Since the estimated 
changepoint for the ERC is around 9 years before clinical symptom 
onset, only a few scans of preclinical cases predate it (the pre- 
changepoint line is primarily estimated based on control subjects). 
The transition appears more clearly with the amygdala for which the 
two regimes are observed, with a quasi-horizontal line followed by a de- 
creasing one after changepoint. The same goes for the right hippocam- 
pus. The top row of Figs. 4-6 represents the controls, together with 
the regression line that is associated with them (identical to the regime 
before changepoint). 

Fig. 7 maps the value of the surface atrophy rate (per year) that was 
detected on segments for which the onset model was found significant 
at 5% FWER. These atrophy rates are defined as 100(exp(/3) — 1 ) before 
changepoint and 100(exp(/3+ jS') — 1 ) after changepoint (see details in 
Appendix A). Table 3 provides the numerical values of the after- 
changepoint rates for each structure and side, with /3 and j6' replaced 
by their average values over all segments. In this table, we have applied 
the changepoint model to a reduced cohort containing subjects with 
three scans or more (81 controls and 30 cognitively impaired), to im- 
prove accuracy. For the left ERC, one finds 2.4% after changepoint and 
0.1% before. On the right side, the rate is 1.6% after and 0.2% before. 
For the amygdala, the left side gives 3.6% after and 0.1% before, and 
the right side is 4.6% after and 0.2% before. Finally, the left hippocampus 
shows a 1.2% atrophy rate after changepoint, and a 0.2% rate before. The 
right side gives 2.7% after and 0.2% before. The onset model, on the re- 
duced cohort, was found significant in all cases. Table 3 also provides at- 
rophy rates in terms of global volume, for which only the right 
hippocampus was found non-significant. Note that surface atrophy 
rates and volume atrophy rates are not directly comparable, although 
one can expect the former to be roughly 2/3 of the latter, which is con- 
sistent with what we observe in our results. 

5. Discussion 

The high-dimensional morphometry model indicates that, during 
preclinical AD, changes in the shape of the entorhinal cortex precede 
those in the hippocampus and the amygdala. In general, the high- 
dimensional diffeomorphometry-based markers are more significant 
than the volume markers in signaling the changepoint time. This is 
the first time to our knowledge that diffeomorphometry has been 
coupled to changepoint models to examine differences in atrophy 
rates among brain structures preceding clinical symptom onset. 

The finding of localized ERC change preceding the changepoint in 
the other two temporal lobe structures is consistent with previous 
MRI reports of ERC volume change during MCI. For example, a differen- 
tial rate of change in the ERC, compared to the hippocampus, has previ- 
ously been reported among individuals with MCI who subsequently 
progressed to AD dementia (Du et al, 2001, 2006; Dickerson et al., 




Fig. 4. Amygdala: left and right; blue: controls; green: before changepoint; magenta: be- 
fore clinical onset; black: after clinical onset. They -axis shows the surface area percentage 
around each vertex expressed relative to original template coordinates averaged over all 
segments. Top row: controls with their associated regression line. Middle row: cognitively 
impaired subjects. Third row: cognitively impaired subjects with time axis shifted by clin- 
ical onset. All values are corrected for random effects. 



2001; Jagust et al, 2006; Du et al, 2006; Jack et al., 2010). The finding 
of significant differences in morphometry change in the amygdala is 
consistent with morphometry reported in symptomatic AD, such as 
the volume (Poulin et al, 2011) and shape analysis (Cavedo et al, 
201 1 ; Qiu et al., 2009b). Alterations in the entorhinal cortex and hippo- 
campus are consistent with changes in memory performance reported 
during preclinical AD. Psychiatric symptomatology (such as apathy 
and irritability) have been reported in MCI cases that subsequently 
progressed to dementia, and it has been suggested that this may, in 
part, result from pathological changes in the amygdala. Further study 
is needed to directly link the ordering of these clinical features with re- 
spect to one another, and to the brain changes reported here. 

Evidence of morphometry changes occurring approximately 
10 years prior to symptom onset is also consistent with current hypoth- 
eses about the long prodromal phase of AD (Jack et al., 2013). This study 
adds additional evidence to support the hypothesis that there are signif- 
icant changes in the brain many years before symptom onset. It is hy- 
pothesized that early therapeutic intervention will have the best 
prospect for modifying disease progression and confer the greatest ben- 
efit to patients in the early stages of AD. Understanding the order in 
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Fig. 5. Hippocampus: left and right; blue: controls; green: before changepoint; magenta: 
before clinical onset; black: after clinical onset. They -axis shows the surface area percent- 
age around each vertex expressed relative to original template coordinates averaged over 
all segments. Top row: controls with their associated regression line. Middle row: cogni- 
tively impaired subjects. Third row: cognitively impaired subjects with time axis shifted 
by clinical onset. All values are corrected for random effects. 



which changes in the brain occur during preclinical AD may therefore 
assist in the design of intervention trials aimed at slowing the evolution 
of the disease. 

It is important to acknowledge the challenges of estimating age of 
symptom onset. While we used a semi-standardized instrument admin- 
istered to both the subject and an informant, with specific targeted 
questions to assess onset of symptoms, known as the Clinical Dementia 
Rating Scale (Morris et al, 1993), the determination was ultimately 
based on the judgment of skilled clinicians. It should be noted, however, 
that we have recently published data on CSF changes in relation to 
symptom onset in this cohort (Moghekar et al., 2013), suggesting that 
this measure is biologically meaningful. 

Several issues must be entertained in considering the validity of our 
statistical analyses. First, the number of available scans can vary among 
subjects (i.e., there are missing data) and ourp-value computation re- 
quires the assumption that this censoring is independent of the group 
variables. This is a reasonable assumption, since most subjects 
were not cognitively impaired when the scans were taken. This as- 
sumption is partially confirmed by the fact that there is no significant 
difference between the numbers of scans per subject in the two 
groups (p= 0.063), although this p-value is small enough to be a 
concern. If we limit the analysis to subjects that had two scans or 
more (resp. three scans or more), however, this p-value becomes 
p= 0.28 (resp. p= 0.59), and the results that are obtained in these 
cases show very few differences compared to the ones obtained for 
the main study. These results are included in Table 4 for comparison 
with Table 2. 

The second issue is whether the changepoint that is observed is ac- 
tually related to the disease, or whether it may be confounded with a 
normal changepoint occurring also in controls (so that our significant 
p-values would reflect the inadequacy of the model for controls, rather 




Fig. 6. ERC: left and right; blue: controls; green: before changepoint; magenta: before clin- 
ical onset; black: after clinical onset. They -axis shows the surface area percentage around 
each vertex expressed relative to original template coordinates averaged over all seg- 
ments. Top row: controls with their associated regression line. Middle row: cognitively 
impaired subjects. Third row: cognitively impaired subjects with time axis shifted by clin- 
ical onset. All values are corrected for random effects. 



than the existence of a disease-related changepoint). To examine this 
possibility, we extended our analysis to include a "normal changepoint", 
obtained by analyzing the control population, and used this changepoint 
to define an additional covariate in our previous analysis. Doing so did not 
impact our conclusion, yielding very minor changes in the obtained p- 
values (data not shown). 

It should be acknowledged that the boundaries between normal 
aging and the earliest symptomatic phase of AD are difficult to define. 
Studies in both animal models and humans demonstrate that there 
are significant cognitive and neurobiological changes with aging, in 
the absence of disease (see Samson and Barnes, 2013 for a review). Ad- 
ditionally, AD pathophysiological processes are evident in a subset of 
older adults who are cognitively normal (Sperling et al, 201 1 ). Studies, 
such as ours, in which cognitively normal individuals are followed for 
many years (with some remaining normal and some developing MCI 
and AD dementia) are one of the best ways currently available of 
disentangling changes related to aging from those that are a harbinger 
of disease. 




Fig. 7. Significant pattern of atrophy rate change. Left panel: seen from patient left side; 
right panel: seen from patient right side. The structures visualized in the left panel are 
the amygdala, hippocampus and ERC from left to right, and their order is reversed on 
the right panel. Color represents rate change coefficient, in percentages, after changepoint 
when significant. Deep blue color (0%) indicates non-significance for the corrected p - 
value. 
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Table 3 

Estimated atrophy rates per year, estimated over subjects with three scans or more (81 controls, 30 MCI/AD) for total volume and segments (averaged over all segments) on each structure 
(NS: non-significant). 



Side 




Amygdala volume 


Amygdala segments 


Hippocampus volume 


Hippocampus segments 


ERC volume 


ERC segments 


Left 


Before 


-0.3% 


-0.1% 


-0.35% 


-0.2% 


-0.3% 


-0.1% 




After 


-4.2% 


-3.6% 


-1.6% 


-1.2% 


-3.0% 


-2.4% 


Right 


Before 


-0.45% 


-0.2% 


-0.4% 


-0.2% 


-0.45% 


-0.2% 




After 


-5.0% 


-4.6% 


-1.7% (NS) 


-2.7% 


-2.7% 


-1.6% 
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Appendix A 

Al. Statistical details 

Al.l. Shape variables at each vertex 

Denote by a kj the age of subject k at their jth scan and by t k the time 
of clinical onset of the disease for the same subject. This time is only de- 
fined for subjects in the MCI/AD group, and we let g k be the associated 
indicator variable, equal to one for subjects in this group and to zero 
for controls. We also let c k denote the intracranial volume and d k the 
gender of subject k (zero for males, one for females). For the jth scan 
of subject k, shape markers are computed at each vertex coordinate 
and averaged over each template segment (shown in Fig. 1); they 
are denoted by yjj ., where q is the segment index. In the case of vol- 
ume, then the log-volume is used as a one-dimensional shape 
marker. 

A1.2. Changepoint onset model 

We now describe the changepoint model introducing the explicit 
role of the ordering parameter A in the atrophy model signaling in 
years the changepoint time preceding clinical symptom time. For this 



we use the changepoint model with linear atrophy turning on A years 
before clinical symptom 

ylj = a q+ P>q a kj + Pqgkfrkj-h + ^W%~k + 

in which H is the Heaviside function, such that H(t) = l if t> 0 and 
H(t) = 0 otherwise. The noise, e q k ., includes random effects and decom- 
poses as 

£ kj = Vk + Skj 

in which all variables r\ and £ with distinct indices are assumed to be 
independent, rjj[~N(0,pg) and ££.~JV(0, o 2 q ). The model parameters are 
a q i P q , jSg , A q , a q and pg (q indexing the shape coordinates). They are es- 
timated by maximum likelihood. 

Via the transition function, the model implies a first-order 
changepoint from atrophy rate j8 q to j8 g + $ q at age a = t k — A q , i.e., 
A q years before cognitive onset. Controls, for which g k = 0, remain in 
the first regime over the observation time. Recall that the vertex- 
indexed shape variables describe the relative value of surface area 
around a given point at logarithmic scale, where the template is the ref- 
erence. As a consequence, our model can be interpreted as 

• Before changepoint: 

(local surface area) = (template local surface area) x (correction 
based on covariates) x exp(j8 q - age) 

• After changepoint: 

(local surface area) = (template local surface area) x (correction 
based on covariates) x ((j8 q + 0 q ) • age — j8 q • (age at changepoint)) 

In other words, jS^O implies a multiplicative factor of exp(j8 q ) per 
year in surface area post changepoint. This is atrophy as soon as 0 q is 
negative, as found in our results. 

We can interpret t k — A 9 as a structural, or anatomical, phenotype 
onset time. In the following, we will test the null hypothesis j8 q = 0, 
which corresponds to the disease having no effect on the shape 
coordinate q. When this hypothesis is rejected with significant p-value 
(corrected for multiple comparisons), we can compute a 
consensus estimator, defined as the average of all estimated A 9 over co- 
ordinates q. 

A1.3. Estimation procedure 

The following computation is made independently for each shape 
coordinate q that we temporarily drop from the notation. We compute 
a maximum likelihood estimator for fixed values of A, taken from a dis- 
crete interval over the timeframe of interest, which is —2 to 13 years, 
with half-year increments. The computation with fixed A uses the EM 
algorithm (Dempster et al, 1977) to estimate the remaining parame- 
ters, treating the random effects rj k as hidden variables. The EM algo- 
rithm alternates, until stabilization, an E-step, in which the log- 
likelihood is averaged using the conditional distribution given observed 
variables, and an M-step, in which this conditionally averaged likeli- 
hood is maximized with respect to the model parameters. More details 
follow. 
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Table 4 

Differences in estimated onset of morphometric change in relationship with symptom onset for the amygdala, entorhinal cortex and hippocampus, for the cohort reduced to patients with 
two scans or more ( 136 controls and 46 impaired) and with three scans or more (81 controls and 30 impaired). The results are very similar to those obtained with the complete cohort in 
Table 2. 



Side 






Amygdala volume 


Amygdala segments 


Hippocampus volume 


Hippocampus segments 


ERC volume 


ERC segments 


Left 


p-Value 


> 2 


0.002 


0.0009 


0.024 


0.15 


0.0001 


0.0005 






> 3 


0.0001 


0.0003 


0.015 


0.02 


0.0005 


0.0001 




Avg. A (std.) 


> 2 


3.1 (2.3) 


2.4(1.4) 


2.8 (1.6) 


2.4(1.1) 


7.4 (3.2) 


8.5 (2.0) 






> 3 


2.3 (2.1) 


2.5(1.2) 


2.2 (1.5) 


2.4(1.1) 


5.5 (3.5) 


6.8 (2.3) 


Right 


p-Value 


> 2 


0.006 


0.0015 


0.25 


0.07 


0.002 


0.013 






> 3 


0.0005 


0.007 


0.18 


0.003 


0.005 


0.01 




Avg. A (std.) 


> 2 


2.8 (3.6) 


3.4 (2.1) 


3.0 (3.9) 


2.8 (1.3) 


9.9 (3.2) 


8.0 (3.2) 






> 3 


2.2 (3.5) 


3.1 (2.0) 


3.2 (4.6) 


2.2 (1.2) 


9.7 (3.2) 


7.9 (3.2) 



E-step: Since we are working with fixed A, introduce for conve- 
nience the variable 

«ki = S k (a kj -t k + A)H(a kj -t k + A) 
so that the model can be written as 

y hj = a + fia kj + 0u kj + yc k + d k + r] k + f kj 

(recall that several variables, including u kj , depend on the shape coordi- 
nate, q, even though this dependency is not made explicit). The joint 
log-likelihood is then given by 

\ n n k , 2 

^-^EE (y kj -a-pa kj -p u kj -yc k -8d k -r] k ) . 

1 A 2 N, 2 n. 2 
"V/S % "2 l0g(J -2 l0gP 



At the end of the algorithm, the likelihood of the observed variables 
is obtained by integrating with respect to the random effects. It is, still 
up to constant additive factors, and using the same notation for 
and t\ % given by the following expression: L = — ^ J2 k =i ((X "ii r kj ) 
— ml) — f logo -2 — §logp 2 + \ J2 k =-i log^. Remembering that we were 
working with fixed shape coordinate q and fixed A, this likelihood is ac- 
tually a function of both q and A. Denote it by L q (A). This likelihood is 
then optimized in A by computing it over a finite number of values, 
yielding an optimal A 9 , associated with parameters (a q , Pq,$ q ,y q ,8 q , 
Og,p q ). The optimal likelihood, L q (A q ), will be used below to assess 
model significance. 

Note that the optimization under the null hypothesis ($ q = 0) 
follows exactly the same approach, using the EM algorithm, except 
that, since the model under the null does not depend on A, there is 
no final optimization loop with respect to this parameter. This com- 
putation returns a likelihood under the null, that we will denote by 
] q 



where n is the number of subjects, n k the number of scans for subject k 
and N is the total number of scans. Let r kj denote the residual (y kj — a — 
p>a k j — p>'u k j — yc k — 8d k — r] k ). Conditional to observed variables y, a, u, 
c, and d, the hidden variable r\ k follows a Gaussian distribution with 
mean and variance 



a 2 + n k p 2 J= i 



(Er k/ ) y k = 



n k p 2 + a 2 * 



From this, it follows that the conditional likelihood is, up to an addi- 
tive constant: 

1 n n k , 2 

^cond = -r^EE (y kj -m k -a-pa kj -pu kj -yc k -8d k ) . 

1 A 2 1 A, 2 2, n. 2 N. 2 

-7r-nlZn k T k -—^J2{T k +m k )--\ogp - T logCT 



2a 2 k 



2p z k 



M-step: This likelihood is optimized with respect to the parame- 
ters in the M-step: a, /3, j8', 7, 6' are updated via linear regression of 
the shape variables y kj — m k against the other variables, which is ob- 
tained via standard least squares. The parameters o 2 and p 2 are then 
derived via 



2 1 " " fc 



m k -a-pa kj -p u kj -yc k -8d k ) 



1 n 2 
+ j-jJ2n k r k 



and 



1 n o o 



These values are then plugged back into the E-step for another iter- 
ation of the EM algorithm, which is run until stabilization. 



A 1 A. Tests for significance 

Our tests are based on the likelihood ratio statistic, which is comput- 
ed independently for each shape coordinate, q, and is given, with our 
previous notation, by 



-L q (A q 



A global statistic is then defined by 5 max = maxS q . To compute p- 
values, we use permutation tests, randomizing the model residuals, 
which are defined by 

4j =ylj- a q - /Vk/-y c k- d k 

where the parameters are estimated under the null hypothesis. 

Permutations operate over subjects, i.e., on the index k, so that scans 
that were associated to the same subject remain together. Let Xf". De- 
note the residuals after applying a permutation tt. We apply the estima- 
tion scheme developed in the previous section after replacing j/j^. by A^ n , 
computing new likelihood ratios Sj, and global statistic S^ ax . We repeat 
this for a large number, say M, of random permutations, and compare 
the obtained statistics to the one associated to the original AjJ ., i.e., 
Smax f° r TT = identity. If v is the number of permutations tt for which 
the associated p-value is given by 



the value of S n max excess 



v+1 
P ~M+1 " 

We used M = 10,000 in our experiments, making p = 0.0001 the 
smallest detectable p-value. 

Variables q for which sf is found larger that the 95th percentile of 
the values of that were observed via permutations are considered 
as significantly affected by group difference. These variables are detect- 
ed at a 5% family-wise error rate (FWER), which means that the proba- 
bility of making one or more false detections among them is less than 
5%. 
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As a final remark in this section, we mention that, since one of the is- 
sues of the EM algorithm is that its solution may depend on the initial- 
ization, we always initialize the estimation scheme for the alternate 
hypothesis with the one found with the null hypothesis. This ensures 
the positivity of the log-likelihood ratio statistics. 

A1.5. Accuracy assessment via bootstrap 

To evaluate the accuracy of the changepoint estimate, we performed 
a resampling procedure (bootstrap). Each resample step draws subjects 
from the original dataset with replacement, in order to obtain a new 
dataset with the same size. A new summary onset time, say A*, is esti- 
mated from this dataset. By repeating this procedure a large number 
of times, one obtains bootstrap estimates of the mean and standard de- 
viation of the estimator. This measures the accuracy of our results, and 
should not be confused with a population standard deviation of the 
onset time, since we estimate only one A for the whole cohort and can- 
not assess its variability across subjects (the small number of scans per 
subject make the computation of individual changepoints impossible). 
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